{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "view-in-github"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/mongodb-developer/GenAI-Showcase/blob/main/notebooks/rag/TraderJoesFallAIPartyPlanner_PlaywrightLlamaIndexVectorSearch.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "H70oZDIZ7IJM"
      },
      "source": [
        "[![View Article](https://img.shields.io/badge/View%20Article-blue)](https://www.mongodb.com/developer/products/mongodb/trader-joes-llamaindex-vector-search/)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ChoXaygi6hKI"
      },
      "source": [
        "## Overview\n",
        "In this tutorial we are going to scrape the Trader Joe's website for all their Fall Faves using Playwright, and then create an AI party planner using the LlamaIndex x MongoDB Vector Search integration and a chat engine. This will help us figure out easily which Trader Joe's fall faves would be perfect for our fall festivities.\n",
        "\n",
        "What's Covered\n",
        "\n",
        "*   Building a Trader Joe’s AI party planner using Playwright, LlamaIndex, and  MongoDB Atlas Vector Search\n",
        "*   Scraping Trader Joe’s fall items with Playwright and formatting them for chatbot use\n",
        "*   Setting up and embedding product data in MongoDB Atlas Vector Store for semantic search\n",
        "*   Creating a Retrieval-Augmented Generation (RAG) chatbot to answer party planning questions\n",
        "*   Adding interactive Chat Engine functionality for back-and-forth Q&A about fall party items\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PLZbFSlUg2Sf"
      },
      "source": [
        "## Part 1: Scrape the Trader Joes website for their fall items"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gd7dny8JR51I"
      },
      "source": [
        "First, let’s go ahead and install Playwright:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "collapsed": true,
        "id": "ii-DUT3OgyC-",
        "outputId": "e8cfbc06-51b5-4847-838b-bf8eeef52d18"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Collecting playwright\n",
            "  Downloading playwright-1.48.0-py3-none-manylinux1_x86_64.whl.metadata (3.5 kB)\n",
            "Requirement already satisfied: greenlet==3.1.1 in /usr/local/lib/python3.10/dist-packages (from playwright) (3.1.1)\n",
            "Collecting pyee==12.0.0 (from playwright)\n",
            "  Downloading pyee-12.0.0-py3-none-any.whl.metadata (2.8 kB)\n",
            "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from pyee==12.0.0->playwright) (4.12.2)\n",
            "Downloading playwright-1.48.0-py3-none-manylinux1_x86_64.whl (38.2 MB)\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m38.2/38.2 MB\u001b[0m \u001b[31m45.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hDownloading pyee-12.0.0-py3-none-any.whl (14 kB)\n",
            "Installing collected packages: pyee, playwright\n",
            "Successfully installed playwright-1.48.0 pyee-12.0.0\n",
            "Downloading Chromium 130.0.6723.31 (playwright build v1140)\u001b[2m from https://playwright.azureedge.net/builds/chromium/1140/chromium-linux.zip\u001b[22m\n",
            "\u001b[1G164.5 MiB [] 0% 0.0s\u001b[0K\u001b[1G164.5 MiB [] 0% 33.9s\u001b[0K\u001b[1G164.5 MiB [] 0% 29.2s\u001b[0K\u001b[1G164.5 MiB [] 0% 16.1s\u001b[0K\u001b[1G164.5 MiB [] 0% 9.0s\u001b[0K\u001b[1G164.5 MiB [] 1% 6.6s\u001b[0K\u001b[1G164.5 MiB [] 1% 5.5s\u001b[0K\u001b[1G164.5 MiB [] 2% 4.6s\u001b[0K\u001b[1G164.5 MiB [] 2% 5.0s\u001b[0K\u001b[1G164.5 MiB [] 2% 5.1s\u001b[0K\u001b[1G164.5 MiB [] 3% 4.9s\u001b[0K\u001b[1G164.5 MiB [] 4% 4.6s\u001b[0K\u001b[1G164.5 MiB [] 4% 4.4s\u001b[0K\u001b[1G164.5 MiB [] 5% 4.1s\u001b[0K\u001b[1G164.5 MiB [] 6% 3.8s\u001b[0K\u001b[1G164.5 MiB [] 6% 3.6s\u001b[0K\u001b[1G164.5 MiB [] 7% 3.5s\u001b[0K\u001b[1G164.5 MiB [] 8% 3.3s\u001b[0K\u001b[1G164.5 MiB [] 8% 3.2s\u001b[0K\u001b[1G164.5 MiB [] 9% 3.1s\u001b[0K\u001b[1G164.5 MiB [] 10% 3.0s\u001b[0K\u001b[1G164.5 MiB [] 11% 2.9s\u001b[0K\u001b[1G164.5 MiB [] 11% 2.8s\u001b[0K\u001b[1G164.5 MiB [] 12% 2.8s\u001b[0K\u001b[1G164.5 MiB [] 12% 2.7s\u001b[0K\u001b[1G164.5 MiB [] 13% 2.7s\u001b[0K\u001b[1G164.5 MiB [] 14% 2.6s\u001b[0K\u001b[1G164.5 MiB [] 15% 2.5s\u001b[0K\u001b[1G164.5 MiB [] 17% 2.4s\u001b[0K\u001b[1G164.5 MiB [] 18% 2.3s\u001b[0K\u001b[1G164.5 MiB [] 19% 2.2s\u001b[0K\u001b[1G164.5 MiB [] 20% 2.1s\u001b[0K\u001b[1G164.5 MiB [] 21% 2.0s\u001b[0K\u001b[1G164.5 MiB [] 22% 2.0s\u001b[0K\u001b[1G164.5 MiB [] 23% 1.9s\u001b[0K\u001b[1G164.5 MiB [] 24% 1.9s\u001b[0K\u001b[1G164.5 MiB [] 25% 1.8s\u001b[0K\u001b[1G164.5 MiB [] 26% 1.9s\u001b[0K\u001b[1G164.5 MiB [] 26% 1.8s\u001b[0K\u001b[1G164.5 MiB [] 27% 1.8s\u001b[0K\u001b[1G164.5 MiB [] 28% 1.8s\u001b[0K\u001b[1G164.5 MiB [] 29% 1.7s\u001b[0K\u001b[1G164.5 MiB [] 30% 1.7s\u001b[0K\u001b[1G164.5 MiB [] 31% 1.7s\u001b[0K\u001b[1G164.5 MiB [] 32% 1.6s\u001b[0K\u001b[1G164.5 MiB [] 33% 1.6s\u001b[0K\u001b[1G164.5 MiB [] 34% 1.6s\u001b[0K\u001b[1G164.5 MiB [] 35% 1.6s\u001b[0K\u001b[1G164.5 MiB [] 36% 1.5s\u001b[0K\u001b[1G164.5 MiB [] 37% 1.5s\u001b[0K\u001b[1G164.5 MiB [] 38% 1.5s\u001b[0K\u001b[1G164.5 
MiB [] 39% 1.4s\u001b[0K\u001b[1G164.5 MiB [] 40% 1.4s\u001b[0K\u001b[1G164.5 MiB [] 41% 1.6s\u001b[0K\u001b[1G164.5 MiB [] 42% 1.6s\u001b[0K\u001b[1G164.5 MiB [] 43% 1.5s\u001b[0K\u001b[1G164.5 MiB [] 44% 1.5s\u001b[0K\u001b[1G164.5 MiB [] 45% 1.5s\u001b[0K\u001b[1G164.5 MiB [] 46% 1.4s\u001b[0K\u001b[1G164.5 MiB [] 47% 1.4s\u001b[0K\u001b[1G164.5 MiB [] 48% 1.4s\u001b[0K\u001b[1G164.5 MiB [] 49% 1.3s\u001b[0K\u001b[1G164.5 MiB [] 50% 1.3s\u001b[0K\u001b[1G164.5 MiB [] 51% 1.3s\u001b[0K\u001b[1G164.5 MiB [] 52% 1.2s\u001b[0K\u001b[1G164.5 MiB [] 53% 1.2s\u001b[0K\u001b[1G164.5 MiB [] 54% 1.2s\u001b[0K\u001b[1G164.5 MiB [] 55% 1.1s\u001b[0K\u001b[1G164.5 MiB [] 56% 1.1s\u001b[0K\u001b[1G164.5 MiB [] 57% 1.1s\u001b[0K\u001b[1G164.5 MiB [] 58% 1.0s\u001b[0K\u001b[1G164.5 MiB [] 60% 1.0s\u001b[0K\u001b[1G164.5 MiB [] 61% 1.0s\u001b[0K\u001b[1G164.5 MiB [] 62% 0.9s\u001b[0K\u001b[1G164.5 MiB [] 63% 0.9s\u001b[0K\u001b[1G164.5 MiB [] 64% 0.9s\u001b[0K\u001b[1G164.5 MiB [] 65% 0.8s\u001b[0K\u001b[1G164.5 MiB [] 66% 0.8s\u001b[0K\u001b[1G164.5 MiB [] 67% 0.8s\u001b[0K\u001b[1G164.5 MiB [] 68% 0.7s\u001b[0K\u001b[1G164.5 MiB [] 69% 0.7s\u001b[0K\u001b[1G164.5 MiB [] 70% 0.7s\u001b[0K\u001b[1G164.5 MiB [] 71% 0.6s\u001b[0K\u001b[1G164.5 MiB [] 73% 0.6s\u001b[0K\u001b[1G164.5 MiB [] 74% 0.6s\u001b[0K\u001b[1G164.5 MiB [] 75% 0.6s\u001b[0K\u001b[1G164.5 MiB [] 76% 0.5s\u001b[0K\u001b[1G164.5 MiB [] 77% 0.5s\u001b[0K\u001b[1G164.5 MiB [] 78% 0.5s\u001b[0K\u001b[1G164.5 MiB [] 79% 0.5s\u001b[0K\u001b[1G164.5 MiB [] 80% 0.4s\u001b[0K\u001b[1G164.5 MiB [] 81% 0.4s\u001b[0K\u001b[1G164.5 MiB [] 82% 0.4s\u001b[0K\u001b[1G164.5 MiB [] 83% 0.4s\u001b[0K\u001b[1G164.5 MiB [] 84% 0.3s\u001b[0K\u001b[1G164.5 MiB [] 85% 0.3s\u001b[0K\u001b[1G164.5 MiB [] 86% 0.3s\u001b[0K\u001b[1G164.5 MiB [] 88% 0.3s\u001b[0K\u001b[1G164.5 MiB [] 89% 0.2s\u001b[0K\u001b[1G164.5 MiB [] 90% 0.2s\u001b[0K\u001b[1G164.5 MiB [] 91% 0.2s\u001b[0K\u001b[1G164.5 MiB [] 92% 0.2s\u001b[0K\u001b[1G164.5 MiB [] 94% 
0.1s\u001b[0K\u001b[1G164.5 MiB [] 95% 0.1s\u001b[0K\u001b[1G164.5 MiB [] 96% 0.1s\u001b[0K\u001b[1G164.5 MiB [] 97% 0.0s\u001b[0K\u001b[1G164.5 MiB [] 99% 0.0s\u001b[0K\u001b[1G164.5 MiB [] 100% 0.0s\u001b[0K\n",
            "Chromium 130.0.6723.31 (playwright build v1140) downloaded to /root/.cache/ms-playwright/chromium-1140\n",
            "Downloading FFMPEG playwright build v1010\u001b[2m from https://playwright.azureedge.net/builds/ffmpeg/1010/ffmpeg-linux.zip\u001b[22m\n",
            "\u001b[1G2.3 MiB [] 0% 0.0s\u001b[0K\u001b[1G2.3 MiB [] 4% 0.4s\u001b[0K\u001b[1G2.3 MiB [] 10% 0.3s\u001b[0K\u001b[1G2.3 MiB [] 24% 0.1s\u001b[0K\u001b[1G2.3 MiB [] 67% 0.0s\u001b[0K\u001b[1G2.3 MiB [] 100% 0.0s\u001b[0K\n",
            "FFMPEG playwright build v1010 downloaded to /root/.cache/ms-playwright/ffmpeg-1010\n",
            "Downloading Firefox 131.0 (playwright build v1465)\u001b[2m from https://playwright.azureedge.net/builds/firefox/1465/firefox-ubuntu-22.04.zip\u001b[22m\n",
            "\u001b[1G86.7 MiB [] 0% 0.0s\u001b[0K\u001b[1G86.7 MiB [] 0% 19.0s\u001b[0K\u001b[1G86.7 MiB [] 0% 16.2s\u001b[0K\u001b[1G86.7 MiB [] 0% 7.6s\u001b[0K\u001b[1G86.7 MiB [] 1% 4.2s\u001b[0K\u001b[1G86.7 MiB [] 2% 3.0s\u001b[0K\u001b[1G86.7 MiB [] 3% 2.5s\u001b[0K\u001b[1G86.7 MiB [] 5% 2.3s\u001b[0K\u001b[1G86.7 MiB [] 5% 2.1s\u001b[0K\u001b[1G86.7 MiB [] 6% 2.1s\u001b[0K\u001b[1G86.7 MiB [] 7% 2.1s\u001b[0K\u001b[1G86.7 MiB [] 8% 2.0s\u001b[0K\u001b[1G86.7 MiB [] 10% 1.8s\u001b[0K\u001b[1G86.7 MiB [] 12% 1.5s\u001b[0K\u001b[1G86.7 MiB [] 13% 1.5s\u001b[0K\u001b[1G86.7 MiB [] 15% 1.4s\u001b[0K\u001b[1G86.7 MiB [] 16% 1.3s\u001b[0K\u001b[1G86.7 MiB [] 18% 1.3s\u001b[0K\u001b[1G86.7 MiB [] 19% 1.3s\u001b[0K\u001b[1G86.7 MiB [] 20% 1.2s\u001b[0K\u001b[1G86.7 MiB [] 22% 1.2s\u001b[0K\u001b[1G86.7 MiB [] 23% 1.1s\u001b[0K\u001b[1G86.7 MiB [] 24% 1.1s\u001b[0K\u001b[1G86.7 MiB [] 26% 1.1s\u001b[0K\u001b[1G86.7 MiB [] 27% 1.0s\u001b[0K\u001b[1G86.7 MiB [] 29% 1.0s\u001b[0K\u001b[1G86.7 MiB [] 31% 1.0s\u001b[0K\u001b[1G86.7 MiB [] 32% 0.9s\u001b[0K\u001b[1G86.7 MiB [] 34% 0.9s\u001b[0K\u001b[1G86.7 MiB [] 36% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 38% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 39% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 41% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 42% 0.7s\u001b[0K\u001b[1G86.7 MiB [] 44% 0.7s\u001b[0K\u001b[1G86.7 MiB [] 46% 0.7s\u001b[0K\u001b[1G86.7 MiB [] 47% 0.7s\u001b[0K\u001b[1G86.7 MiB [] 49% 0.6s\u001b[0K\u001b[1G86.7 MiB [] 51% 0.6s\u001b[0K\u001b[1G86.7 MiB [] 54% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 56% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 58% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 59% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 60% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 61% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 61% 0.6s\u001b[0K\u001b[1G86.7 MiB [] 62% 0.6s\u001b[0K\u001b[1G86.7 MiB [] 64% 0.6s\u001b[0K\u001b[1G86.7 MiB [] 64% 0.7s\u001b[0K\u001b[1G86.7 MiB [] 64% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 64% 0.9s\u001b[0K\u001b[1G86.7 MiB [] 64% 1.0s\u001b[0K\u001b[1G86.7 MiB [] 
64% 1.1s\u001b[0K\u001b[1G86.7 MiB [] 64% 1.2s\u001b[0K\u001b[1G86.7 MiB [] 65% 1.2s\u001b[0K\u001b[1G86.7 MiB [] 65% 1.3s\u001b[0K\u001b[1G86.7 MiB [] 66% 1.3s\u001b[0K\u001b[1G86.7 MiB [] 67% 1.3s\u001b[0K\u001b[1G86.7 MiB [] 69% 1.2s\u001b[0K\u001b[1G86.7 MiB [] 70% 1.1s\u001b[0K\u001b[1G86.7 MiB [] 72% 1.0s\u001b[0K\u001b[1G86.7 MiB [] 73% 1.0s\u001b[0K\u001b[1G86.7 MiB [] 74% 0.9s\u001b[0K\u001b[1G86.7 MiB [] 75% 0.9s\u001b[0K\u001b[1G86.7 MiB [] 76% 0.9s\u001b[0K\u001b[1G86.7 MiB [] 77% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 77% 0.9s\u001b[0K\u001b[1G86.7 MiB [] 78% 0.9s\u001b[0K\u001b[1G86.7 MiB [] 78% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 79% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 80% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 81% 0.8s\u001b[0K\u001b[1G86.7 MiB [] 81% 0.7s\u001b[0K\u001b[1G86.7 MiB [] 82% 0.7s\u001b[0K\u001b[1G86.7 MiB [] 83% 0.7s\u001b[0K\u001b[1G86.7 MiB [] 83% 0.6s\u001b[0K\u001b[1G86.7 MiB [] 84% 0.6s\u001b[0K\u001b[1G86.7 MiB [] 85% 0.6s\u001b[0K\u001b[1G86.7 MiB [] 86% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 87% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 88% 0.5s\u001b[0K\u001b[1G86.7 MiB [] 88% 0.4s\u001b[0K\u001b[1G86.7 MiB [] 89% 0.4s\u001b[0K\u001b[1G86.7 MiB [] 90% 0.4s\u001b[0K\u001b[1G86.7 MiB [] 91% 0.4s\u001b[0K\u001b[1G86.7 MiB [] 92% 0.3s\u001b[0K\u001b[1G86.7 MiB [] 93% 0.3s\u001b[0K\u001b[1G86.7 MiB [] 94% 0.2s\u001b[0K\u001b[1G86.7 MiB [] 95% 0.2s\u001b[0K\u001b[1G86.7 MiB [] 96% 0.2s\u001b[0K\u001b[1G86.7 MiB [] 97% 0.1s\u001b[0K\u001b[1G86.7 MiB [] 98% 0.1s\u001b[0K\u001b[1G86.7 MiB [] 100% 0.0s\u001b[0K\n",
            "Firefox 131.0 (playwright build v1465) downloaded to /root/.cache/ms-playwright/firefox-1465\n",
            "Downloading Webkit 18.0 (playwright build v2083)\u001b[2m from https://playwright.azureedge.net/builds/webkit/2083/webkit-ubuntu-22.04.zip\u001b[22m\n",
            "\u001b[1G90.5 MiB [] 0% 0.0s\u001b[0K\u001b[1G90.5 MiB [] 0% 19.8s\u001b[0K\u001b[1G90.5 MiB [] 0% 21.1s\u001b[0K\u001b[1G90.5 MiB [] 0% 25.5s\u001b[0K\u001b[1G90.5 MiB [] 0% 25.3s\u001b[0K\u001b[1G90.5 MiB [] 0% 16.3s\u001b[0K\u001b[1G90.5 MiB [] 1% 8.9s\u001b[0K\u001b[1G90.5 MiB [] 1% 6.5s\u001b[0K\u001b[1G90.5 MiB [] 2% 5.4s\u001b[0K\u001b[1G90.5 MiB [] 3% 4.7s\u001b[0K\u001b[1G90.5 MiB [] 4% 4.1s\u001b[0K\u001b[1G90.5 MiB [] 4% 4.2s\u001b[0K\u001b[1G90.5 MiB [] 5% 4.0s\u001b[0K\u001b[1G90.5 MiB [] 5% 3.9s\u001b[0K\u001b[1G90.5 MiB [] 6% 3.6s\u001b[0K\u001b[1G90.5 MiB [] 7% 3.5s\u001b[0K\u001b[1G90.5 MiB [] 7% 3.7s\u001b[0K\u001b[1G90.5 MiB [] 8% 3.7s\u001b[0K\u001b[1G90.5 MiB [] 8% 3.5s\u001b[0K\u001b[1G90.5 MiB [] 9% 3.5s\u001b[0K\u001b[1G90.5 MiB [] 10% 3.3s\u001b[0K\u001b[1G90.5 MiB [] 10% 3.2s\u001b[0K\u001b[1G90.5 MiB [] 11% 3.2s\u001b[0K\u001b[1G90.5 MiB [] 12% 3.1s\u001b[0K\u001b[1G90.5 MiB [] 12% 3.0s\u001b[0K\u001b[1G90.5 MiB [] 13% 2.9s\u001b[0K\u001b[1G90.5 MiB [] 14% 2.7s\u001b[0K\u001b[1G90.5 MiB [] 15% 2.6s\u001b[0K\u001b[1G90.5 MiB [] 16% 2.6s\u001b[0K\u001b[1G90.5 MiB [] 17% 2.5s\u001b[0K\u001b[1G90.5 MiB [] 18% 2.4s\u001b[0K\u001b[1G90.5 MiB [] 19% 2.4s\u001b[0K\u001b[1G90.5 MiB [] 20% 2.3s\u001b[0K\u001b[1G90.5 MiB [] 21% 2.2s\u001b[0K\u001b[1G90.5 MiB [] 22% 2.2s\u001b[0K\u001b[1G90.5 MiB [] 23% 2.1s\u001b[0K\u001b[1G90.5 MiB [] 24% 2.1s\u001b[0K\u001b[1G90.5 MiB [] 25% 2.0s\u001b[0K\u001b[1G90.5 MiB [] 27% 1.9s\u001b[0K\u001b[1G90.5 MiB [] 28% 1.9s\u001b[0K\u001b[1G90.5 MiB [] 29% 1.8s\u001b[0K\u001b[1G90.5 MiB [] 30% 1.7s\u001b[0K\u001b[1G90.5 MiB [] 30% 1.8s\u001b[0K\u001b[1G90.5 MiB [] 31% 1.8s\u001b[0K\u001b[1G90.5 MiB [] 31% 1.9s\u001b[0K\u001b[1G90.5 MiB [] 32% 2.0s\u001b[0K\u001b[1G90.5 MiB [] 32% 2.1s\u001b[0K\u001b[1G90.5 MiB [] 33% 2.1s\u001b[0K\u001b[1G90.5 MiB [] 33% 2.2s\u001b[0K\u001b[1G90.5 MiB [] 34% 2.2s\u001b[0K\u001b[1G90.5 MiB [] 35% 2.1s\u001b[0K\u001b[1G90.5 MiB [] 37% 2.0s\u001b[0K\u001b[1G90.5 MiB [] 39% 
1.8s\u001b[0K\u001b[1G90.5 MiB [] 40% 1.7s\u001b[0K\u001b[1G90.5 MiB [] 42% 1.6s\u001b[0K\u001b[1G90.5 MiB [] 44% 1.6s\u001b[0K\u001b[1G90.5 MiB [] 45% 1.5s\u001b[0K\u001b[1G90.5 MiB [] 47% 1.4s\u001b[0K\u001b[1G90.5 MiB [] 48% 1.3s\u001b[0K\u001b[1G90.5 MiB [] 50% 1.3s\u001b[0K\u001b[1G90.5 MiB [] 51% 1.2s\u001b[0K\u001b[1G90.5 MiB [] 52% 1.2s\u001b[0K\u001b[1G90.5 MiB [] 54% 1.2s\u001b[0K\u001b[1G90.5 MiB [] 55% 1.1s\u001b[0K\u001b[1G90.5 MiB [] 57% 1.0s\u001b[0K\u001b[1G90.5 MiB [] 58% 1.0s\u001b[0K\u001b[1G90.5 MiB [] 59% 1.0s\u001b[0K\u001b[1G90.5 MiB [] 60% 0.9s\u001b[0K\u001b[1G90.5 MiB [] 62% 0.9s\u001b[0K\u001b[1G90.5 MiB [] 63% 0.9s\u001b[0K\u001b[1G90.5 MiB [] 64% 0.8s\u001b[0K\u001b[1G90.5 MiB [] 65% 0.8s\u001b[0K\u001b[1G90.5 MiB [] 66% 0.8s\u001b[0K\u001b[1G90.5 MiB [] 68% 0.7s\u001b[0K\u001b[1G90.5 MiB [] 69% 0.7s\u001b[0K\u001b[1G90.5 MiB [] 70% 0.7s\u001b[0K\u001b[1G90.5 MiB [] 72% 0.7s\u001b[0K\u001b[1G90.5 MiB [] 73% 0.6s\u001b[0K\u001b[1G90.5 MiB [] 74% 0.6s\u001b[0K\u001b[1G90.5 MiB [] 76% 0.6s\u001b[0K\u001b[1G90.5 MiB [] 78% 0.5s\u001b[0K\u001b[1G90.5 MiB [] 79% 0.5s\u001b[0K\u001b[1G90.5 MiB [] 80% 0.5s\u001b[0K\u001b[1G90.5 MiB [] 81% 0.5s\u001b[0K\u001b[1G90.5 MiB [] 82% 0.5s\u001b[0K\u001b[1G90.5 MiB [] 84% 0.4s\u001b[0K\u001b[1G90.5 MiB [] 85% 0.4s\u001b[0K\u001b[1G90.5 MiB [] 87% 0.3s\u001b[0K\u001b[1G90.5 MiB [] 89% 0.3s\u001b[0K\u001b[1G90.5 MiB [] 92% 0.2s\u001b[0K\u001b[1G90.5 MiB [] 94% 0.1s\u001b[0K\u001b[1G90.5 MiB [] 95% 0.1s\u001b[0K\u001b[1G90.5 MiB [] 96% 0.1s\u001b[0K\u001b[1G90.5 MiB [] 98% 0.0s\u001b[0K\u001b[1G90.5 MiB [] 100% 0.0s\u001b[0K\n",
            "Webkit 18.0 (playwright build v2083) downloaded to /root/.cache/ms-playwright/webkit-2083\n",
            "Playwright Host validation warning: \n",
            "╔══════════════════════════════════════════════════════╗\n",
            "║ Host system is missing dependencies to run browsers. ║\n",
            "║ Missing libraries:                                   ║\n",
            "║     libwoff2dec.so.1.0.2                             ║\n",
            "║     libgstgl-1.0.so.0                                ║\n",
            "║     libgstcodecparsers-1.0.so.0                      ║\n",
            "║     libharfbuzz-icu.so.0                             ║\n",
            "║     libenchant-2.so.2                                ║\n",
            "║     libsecret-1.so.0                                 ║\n",
            "║     libhyphen.so.0                                   ║\n",
            "║     libmanette-0.2.so.0                              ║\n",
            "╚══════════════════════════════════════════════════════╝\n",
            "    at validateDependenciesLinux (/usr/local/lib/python3.10/dist-packages/playwright/driver/package/lib/server/registry/dependencies.js:216:9)\n",
            "\u001b[90m    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\u001b[39m\n",
            "    at async Registry._validateHostRequirements (/usr/local/lib/python3.10/dist-packages/playwright/driver/package/lib/server/registry/index.js:707:43)\n",
            "    at async Registry._validateHostRequirementsForExecutableIfNeeded (/usr/local/lib/python3.10/dist-packages/playwright/driver/package/lib/server/registry/index.js:805:7)\n",
            "    at async Registry.validateHostRequirementsForExecutablesIfNeeded (/usr/local/lib/python3.10/dist-packages/playwright/driver/package/lib/server/registry/index.js:794:43)\n",
            "    at async t.<anonymous> (/usr/local/lib/python3.10/dist-packages/playwright/driver/package/lib/cli/program.js:119:7)\n"
          ]
        }
      ],
      "source": [
        "!pip install playwright\n",
        "!playwright install"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iOXkEkR-SA6w"
      },
      "source": [
        "Once that’s done installing, we can go ahead and import our necessary packages:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zL_ml-5iiF0J"
      },
      "outputs": [],
      "source": [
        "from playwright.async_api import async_playwright"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XXWeTiRSSC45"
      },
      "source": [
        "Please keep in mind that we are using `async` because we are running everything inside of a Google Colab notebook.\n",
        "\n",
        "Now, let’s start building our `traderJoesScraper`:"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "E5VKmO4-SIPF"
      },
      "source": [
        "We started off with manually putting in all the links we want to scrape the information off of, please keep in mind that if you’re hoping to turn this into a scalable application it’s recommended to use pagination for this part, but for the sake of simplicity, we can input them manually.\n",
        "\n",
        "Then we just looped through each of the URL’s listed, waited for our main selector to show up that had all the elements we were hoping to scrape, and then extracted our “name” and “price”.\n",
        "\n",
        "Once we ran that, we got a list of all our products from the Fall Faves tag!\n"
      ]
    },
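    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a rough sketch of the pagination idea mentioned above (not part of the original scraper): the hard-coded Food URLs differ only in the `page` value inside the URL-encoded `filters` parameter, so they could be generated instead of typed out. `build_page_url` and `BASE_URL` are hypothetical names introduced here for illustration. The sketch assumes the site keeps the same filter layout and that you already know how many result pages a category has."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import json\n",
        "from urllib.parse import quote_plus\n",
        "\n",
        "# Assumption: the base URL and filter layout are copied from the hard-coded\n",
        "# Food URLs in the scraper; the site may change them at any time.\n",
        "BASE_URL = \"https://www.traderjoes.com/home/products/category/food-8\"\n",
        "\n",
        "\n",
        "def build_page_url(page_number, tag=\"Fall Faves\", base_url=BASE_URL):\n",
        "    # Hypothetical helper: rebuild a filtered category URL for one results page\n",
        "    filters = {\"tags\": [tag]}\n",
        "    if page_number > 1:\n",
        "        # The first page carries no \"page\" key in the hard-coded URLs\n",
        "        filters[\"page\"] = page_number\n",
        "    # Compact separators reproduce the encoded JSON seen in the original URLs\n",
        "    encoded = quote_plus(json.dumps(filters, separators=(\",\", \":\")))\n",
        "    return f\"{base_url}?filters={encoded}\"\n",
        "\n",
        "\n",
        "# Same shape as the entries in the scraper's `pages` list, for pages 1 to 3\n",
        "food_pages = [{\"url\": build_page_url(n), \"category\": \"Food\"} for n in range(1, 4)]\n",
        "for entry in food_pages:\n",
        "    print(entry[\"url\"])"
      ]
    },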
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "collapsed": true,
        "id": "DPsW2K0YiIw0",
        "outputId": "eb1ac281-6c6f-4504-fe8a-08d4a63abd79"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Name: Teeny Tiny Pecan Pies, Price: 4.99, Category: Food\n",
            "Name: Hickory Road Smokehouse Uncured Carver Ham, Price: 5.99, Category: Food\n",
            "Name: Frick's Uncured Carver Ham, Price: 5.99, Category: Food\n",
            "Name: Organic Fuyu Persimmons, Price: 3.99, Category: Food\n",
            "Name: Cut Sweet Potatoes, Price: 5.49, Category: Food\n",
            "Name: Jumbo Pomegranate, Price: 2.99, Category: Food\n",
            "Name: Cheesy Herby Biscuits, Price: 4.99, Category: Food\n",
            "Name: Egg Nog Whole Milk Greek Yogurt, Price: 0.99, Category: Food\n",
            "Name: Triple Creme Brie with Calvados Apple Brandy, Price: 12.99, Category: Food\n",
            "Name: Chocolatey Caramel Pretzel Drumstick Decorating Kit, Price: 4.99, Category: Food\n",
            "Name: Hot Chocolate Stirring Spoon, Price: 0.99, Category: Food\n",
            "Name: Gluten Free Turkey Gravy, Price: 3.99, Category: Food\n",
            "Name: Dinner Rolls, Price: 2.99, Category: Food\n",
            "Name: Organic Peeled & Cooked Chestnuts, Price: 4.99, Category: Food\n",
            "Name: Butter with Brown Sugar & Maple Syrup, Price: 2.99, Category: Food\n",
            "Name: Double Fold Alcohol Free Bourbon Vanilla Flavoring, Price: 7.99, Category: Food\n",
            "Name: Delicata Squash, Price: 1.49, Category: Food\n",
            "Name: Spaghetti Squash, Price: 2.99, Category: Food\n",
            "Name: Sugar Bee® Apple, Price: 1.29, Category: Food\n",
            "Name: Savory Squash Pastry Bites, Price: 5.49, Category: Food\n",
            "Name: All Butter Apple Shortbread Cookies, Price: 3.49, Category: Food\n",
            "Name: Caramelized Onion Goat's Milk Cheese, Price: 2.99, Category: Food\n",
            "Name: Pumpkin Loaf, Price: 4.99, Category: Food\n",
            "Name: Pumpkin Butter, Price: 2.99, Category: Food\n",
            "Name: Fresh Cranberries, Price: 2.29, Category: Food\n",
            "Name: Cinnamon Sticks, Price: 2.99, Category: Food\n",
            "Name: Fully Cooked Spiral Sliced Uncured Half Ham, Price: 5.99, Category: Food\n",
            "Name: Non-Dairy Cinnamon Bun Oat Creamer, Price: 1.99, Category: Food\n",
            "Name: Organic Pomegranate, Price: 2.49, Category: Food\n",
            "Name: Organic Cranberries, Price: 2.99, Category: Food\n",
            "Name: Thanksgiving Stuffing Seasoned Popcorn, Price: 2.99, Category: Food\n",
            "Name: Organic Pumpkin, Price: 2.49, Category: Food\n",
            "Name: Apple Overnight Oats, Price: 1.99, Category: Food\n",
            "Name: Mashed Sweet Potatoes, Price: 2.99, Category: Food\n",
            "Name: Halloween Gummies, Price: 4.49, Category: Food\n",
            "Name: Teeny Tiny Apple Pies, Price: 4.99, Category: Food\n",
            "Name: Lil' Tiger Stripe Pumpkin, Price: 1.49, Category: Food\n",
            "Name: Petite Pumpkin Spice Cookies, Price: 3.99, Category: Food\n",
            "Name: Pumpkin Joe-Joe's Cookies, Price: 2.99, Category: Food\n",
            "Name: Double Fold Bourbon Vanilla Extract, Price: 7.99, Category: Food\n",
            "Name: Butternut Squash Italian Lasagna, Price: 4.49, Category: Food\n",
            "Name: Apple Cinnamon Buns, Price: 4.49, Category: Food\n",
            "Name: Cut Butternut Squash, Price: 3.99, Category: Food\n",
            "Name: Pumpkin Pie Spice, Price: 2.99, Category: Food\n",
            "Name: Mini Maple Flavored Marshmallows, Price: 2.99, Category: Food\n",
            "Name: Maple Flavored Fudge, Price: 2.99, Category: Food\n",
            "Name: Double Crème Brie with Truffles, Price: 9.99, Category: Food\n",
            "Name: Truffle Dip, Price: 5.49, Category: Food\n",
            "Name: Pumpkin Pie, Price: 6.99, Category: Food\n",
            "Name: Haricots Verts, Price: 5.99, Category: Food\n",
            "Name: Herbes de Provence, Price: 4.99, Category: Food\n",
            "Name: Creamy Toscano Cheese Dusted with Cinnamon, Price: 10.99, Category: Food\n",
            "Name: Caramel Apple Mochi, Price: 4.99, Category: Food\n",
            "Name: Cornbread Stuffing, Price: 5.99, Category: Food\n",
            "Name: Pumpkin Spiced Joe-Joe's Sandwich Cookies, Price: 4.49, Category: Food\n",
            "Name: Salted Maple Ice Cream, Price: 3.79, Category: Food\n",
            "Name: Roasted Turkey & Sweet Potato Burrito, Price: 4.49, Category: Food\n",
            "Name: Cinnamon Roll Blondie Bar Baking Mix, Price: 3.99, Category: Food\n",
            "Name: Pumpkin Cheesecake Croissants, Price: 4.49, Category: Food\n",
            "Name: Apple Cider Donuts, Price: 4.49, Category: Food\n",
            "Name: Pumpkin Overnight Oats, Price: 1.99, Category: Food\n",
            "Name: Fuyu Persimmons, Price: 0.79, Category: Food\n",
            "Name: Cut Butternut Squash, Price: 3.99, Category: Food\n",
            "Name: Brussels Sprouts, Price: 4.99, Category: Food\n",
            "Name: Thanksgiving Stuffing Seasoned Kettle Chips, Price: 2.99, Category: Food\n",
            "Name: Nuts About Rosemary Mix, Price: 7.99, Category: Food\n",
            "Name: Brined Bone-In Half Turkey Breast Fully Cooked, Price: 9.99, Category: Food\n",
            "Name: Harvest Apple Salad Kit, Price: 3.99, Category: Food\n",
            "Name: Cornbread Stuffing Mix, Price: 4.99, Category: Food\n",
            "Name: Condensed Cream of Portabella Mushroom Soup, Price: 1.99, Category: Food\n",
            "Name: All Butter Puff Pastry Sheets, Price: 4.99, Category: Food\n",
            "Name: Turkey Gravy, Price: 1.69, Category: Food\n",
            "Name: Nantucket Style Cranberry Pie, Price: 6.99, Category: Food\n",
            "Name: White Stilton with Cranberries, Price: 11.99, Category: Food\n",
            "Name: Truffle Salami, Price: 4.99, Category: Food\n",
            "Name: Autumn Maple Coffee, Price: 8.99, Category: Beverage\n",
            "Name: Triple Ginger Brew Sparkling Beverage, Price: 3.99, Category: Beverage\n",
            "Name: Harvest Blend Herbal Tea, Price: 2.49, Category: Beverage\n",
            "Name: Non-Dairy Oat Beverage Maple Flavor, Price: 2.99, Category: Beverage\n",
            "Name: Maple Espresso Black Tea Blend, Price: 2.99, Category: Beverage\n",
            "Name: Non-Dairy Pumpkin Oat Beverage, Price: 2.99, Category: Beverage\n",
            "Name: Mum Fleurettes, Price: 4.99, Category: Flowers&Plants\n",
            "Name: Assorted Mum Plants, Price: 6.99, Category: Flowers&Plants\n",
            "Name: Eight Candles, Price: 4.49, Category: EverythingElse\n",
            "Name: Orange & Spice Scented Candle & Room Spritz, Price: 5.99, Category: EverythingElse\n",
            "Name: Harvest Brunch Dog Treats, Price: 3.49, Category: EverythingElse\n",
            "Name: Cinnamon Whisk, Price: 1.29, Category: EverythingElse\n",
            "Name: Cinnamon Broom, Price: 4.99, Category: EverythingElse\n",
            "Name: Pumpkin Maple Bacon Stuffies Dog Treats, Price: 4.49, Category: EverythingElse\n",
            "[{'name': 'Teeny Tiny Pecan Pies', 'price': 4.99, 'category': 'Food'}, {'name': 'Hickory Road Smokehouse Uncured Carver Ham', 'price': 5.99, 'category': 'Food'}, {'name': \"Frick's Uncured Carver Ham\", 'price': 5.99, 'category': 'Food'}, {'name': 'Organic Fuyu Persimmons', 'price': 3.99, 'category': 'Food'}, {'name': 'Cut Sweet Potatoes', 'price': 5.49, 'category': 'Food'}, {'name': 'Jumbo Pomegranate', 'price': 2.99, 'category': 'Food'}, {'name': 'Cheesy Herby Biscuits', 'price': 4.99, 'category': 'Food'}, {'name': 'Egg Nog Whole Milk Greek Yogurt', 'price': 0.99, 'category': 'Food'}, {'name': 'Triple Creme Brie with Calvados Apple Brandy', 'price': 12.99, 'category': 'Food'}, {'name': 'Chocolatey Caramel Pretzel Drumstick Decorating Kit', 'price': 4.99, 'category': 'Food'}, {'name': 'Hot Chocolate Stirring Spoon', 'price': 0.99, 'category': 'Food'}, {'name': 'Gluten Free Turkey Gravy', 'price': 3.99, 'category': 'Food'}, {'name': 'Dinner Rolls', 'price': 2.99, 'category': 'Food'}, {'name': 'Organic Peeled & Cooked Chestnuts', 'price': 4.99, 'category': 'Food'}, {'name': 'Butter with Brown Sugar & Maple Syrup', 'price': 2.99, 'category': 'Food'}, {'name': 'Double Fold Alcohol Free Bourbon Vanilla Flavoring', 'price': 7.99, 'category': 'Food'}, {'name': 'Delicata Squash', 'price': 1.49, 'category': 'Food'}, {'name': 'Spaghetti Squash', 'price': 2.99, 'category': 'Food'}, {'name': 'Sugar Bee® Apple', 'price': 1.29, 'category': 'Food'}, {'name': 'Savory Squash Pastry Bites', 'price': 5.49, 'category': 'Food'}, {'name': 'All Butter Apple Shortbread Cookies', 'price': 3.49, 'category': 'Food'}, {'name': \"Caramelized Onion Goat's Milk Cheese\", 'price': 2.99, 'category': 'Food'}, {'name': 'Pumpkin Loaf', 'price': 4.99, 'category': 'Food'}, {'name': 'Pumpkin Butter', 'price': 2.99, 'category': 'Food'}, {'name': 'Fresh Cranberries', 'price': 2.29, 'category': 'Food'}, {'name': 'Cinnamon Sticks', 'price': 2.99, 'category': 'Food'}, {'name': 'Fully Cooked 
Spiral Sliced Uncured Half Ham', 'price': 5.99, 'category': 'Food'}, {'name': 'Non-Dairy Cinnamon Bun Oat Creamer', 'price': 1.99, 'category': 'Food'}, {'name': 'Organic Pomegranate', 'price': 2.49, 'category': 'Food'}, {'name': 'Organic Cranberries', 'price': 2.99, 'category': 'Food'}, {'name': 'Thanksgiving Stuffing Seasoned Popcorn', 'price': 2.99, 'category': 'Food'}, {'name': 'Organic Pumpkin', 'price': 2.49, 'category': 'Food'}, {'name': 'Apple Overnight Oats', 'price': 1.99, 'category': 'Food'}, {'name': 'Mashed Sweet Potatoes', 'price': 2.99, 'category': 'Food'}, {'name': 'Halloween Gummies', 'price': 4.49, 'category': 'Food'}, {'name': 'Teeny Tiny Apple Pies', 'price': 4.99, 'category': 'Food'}, {'name': \"Lil' Tiger Stripe Pumpkin\", 'price': 1.49, 'category': 'Food'}, {'name': 'Petite Pumpkin Spice Cookies', 'price': 3.99, 'category': 'Food'}, {'name': \"Pumpkin Joe-Joe's Cookies\", 'price': 2.99, 'category': 'Food'}, {'name': 'Double Fold Bourbon Vanilla Extract', 'price': 7.99, 'category': 'Food'}, {'name': 'Butternut Squash Italian Lasagna', 'price': 4.49, 'category': 'Food'}, {'name': 'Apple Cinnamon Buns', 'price': 4.49, 'category': 'Food'}, {'name': 'Cut Butternut Squash', 'price': 3.99, 'category': 'Food'}, {'name': 'Pumpkin Pie Spice', 'price': 2.99, 'category': 'Food'}, {'name': 'Mini Maple Flavored Marshmallows', 'price': 2.99, 'category': 'Food'}, {'name': 'Maple Flavored Fudge', 'price': 2.99, 'category': 'Food'}, {'name': 'Double Crème Brie with Truffles', 'price': 9.99, 'category': 'Food'}, {'name': 'Truffle Dip', 'price': 5.49, 'category': 'Food'}, {'name': 'Pumpkin Pie', 'price': 6.99, 'category': 'Food'}, {'name': 'Haricots Verts', 'price': 5.99, 'category': 'Food'}, {'name': 'Herbes de Provence', 'price': 4.99, 'category': 'Food'}, {'name': 'Creamy Toscano Cheese Dusted with Cinnamon', 'price': 10.99, 'category': 'Food'}, {'name': 'Caramel Apple Mochi', 'price': 4.99, 'category': 'Food'}, {'name': 'Cornbread Stuffing', 'price': 5.99, 
'category': 'Food'}, {'name': \"Pumpkin Spiced Joe-Joe's Sandwich Cookies\", 'price': 4.49, 'category': 'Food'}, {'name': 'Salted Maple Ice Cream', 'price': 3.79, 'category': 'Food'}, {'name': 'Roasted Turkey & Sweet Potato Burrito', 'price': 4.49, 'category': 'Food'}, {'name': 'Cinnamon Roll Blondie Bar Baking Mix', 'price': 3.99, 'category': 'Food'}, {'name': 'Pumpkin Cheesecake Croissants', 'price': 4.49, 'category': 'Food'}, {'name': 'Apple Cider Donuts', 'price': 4.49, 'category': 'Food'}, {'name': 'Pumpkin Overnight Oats', 'price': 1.99, 'category': 'Food'}, {'name': 'Fuyu Persimmons', 'price': 0.79, 'category': 'Food'}, {'name': 'Cut Butternut Squash', 'price': 3.99, 'category': 'Food'}, {'name': 'Brussels Sprouts', 'price': 4.99, 'category': 'Food'}, {'name': 'Thanksgiving Stuffing Seasoned Kettle Chips', 'price': 2.99, 'category': 'Food'}, {'name': 'Nuts About Rosemary Mix', 'price': 7.99, 'category': 'Food'}, {'name': 'Brined Bone-In Half Turkey Breast Fully Cooked', 'price': 9.99, 'category': 'Food'}, {'name': 'Harvest Apple Salad Kit', 'price': 3.99, 'category': 'Food'}, {'name': 'Cornbread Stuffing Mix', 'price': 4.99, 'category': 'Food'}, {'name': 'Condensed Cream of Portabella Mushroom Soup', 'price': 1.99, 'category': 'Food'}, {'name': 'All Butter Puff Pastry Sheets', 'price': 4.99, 'category': 'Food'}, {'name': 'Turkey Gravy', 'price': 1.69, 'category': 'Food'}, {'name': 'Nantucket Style Cranberry Pie', 'price': 6.99, 'category': 'Food'}, {'name': 'White Stilton with Cranberries', 'price': 11.99, 'category': 'Food'}, {'name': 'Truffle Salami', 'price': 4.99, 'category': 'Food'}, {'name': 'Autumn Maple Coffee', 'price': 8.99, 'category': 'Beverage'}, {'name': 'Triple Ginger Brew Sparkling Beverage', 'price': 3.99, 'category': 'Beverage'}, {'name': 'Harvest Blend Herbal Tea', 'price': 2.49, 'category': 'Beverage'}, {'name': 'Non-Dairy Oat Beverage Maple Flavor', 'price': 2.99, 'category': 'Beverage'}, {'name': 'Maple Espresso Black Tea Blend', 
'price': 2.99, 'category': 'Beverage'}, {'name': 'Non-Dairy Pumpkin Oat Beverage', 'price': 2.99, 'category': 'Beverage'}, {'name': 'Mum Fleurettes', 'price': 4.99, 'category': 'Flowers&Plants'}, {'name': 'Assorted Mum Plants', 'price': 6.99, 'category': 'Flowers&Plants'}, {'name': 'Eight Candles', 'price': 4.49, 'category': 'EverythingElse'}, {'name': 'Orange & Spice Scented Candle & Room Spritz', 'price': 5.99, 'category': 'EverythingElse'}, {'name': 'Harvest Brunch Dog Treats', 'price': 3.49, 'category': 'EverythingElse'}, {'name': 'Cinnamon Whisk', 'price': 1.29, 'category': 'EverythingElse'}, {'name': 'Cinnamon Broom', 'price': 4.99, 'category': 'EverythingElse'}, {'name': 'Pumpkin Maple Bacon Stuffies Dog Treats', 'price': 4.49, 'category': 'EverythingElse'}]\n"
          ]
        }
      ],
      "source": [
        "async def traderJoesScraper():\n",
        "    async with async_playwright() as playwright:\n",
        "        # use headless mode since we are using Colab\n",
        "        browser = await playwright.chromium.launch(headless=True)\n",
        "        page = await browser.new_page()\n",
        "\n",
        "        # all the URLs for my foods, bevs, flowers&plants, and everything else categories\n",
        "        pages = [\n",
        "            {\n",
        "                \"url\": \"https://www.traderjoes.com/home/products/category/food-8?filters=%7B%22tags%22%3A%5B%22Fall+Faves%22%5D%7D\",\n",
        "                \"category\": \"Food\",\n",
        "            },\n",
        "            {\n",
        "                \"url\": \"https://www.traderjoes.com/home/products/category/food-8?filters=%7B%22tags%22%3A%5B%22Fall+Faves%22%5D%2C%22page%22%3A2%7D\",\n",
        "                \"category\": \"Food\",\n",
        "            },\n",
        "            {\n",
        "                \"url\": \"https://www.traderjoes.com/home/products/category/food-8?filters=%7B%22tags%22%3A%5B%22Fall+Faves%22%5D%2C%22page%22%3A3%7D\",\n",
        "                \"category\": \"Food\",\n",
        "            },\n",
        "            {\n",
        "                \"url\": \"https://www.traderjoes.com/home/products/category/food-8?filters=%7B%22tags%22%3A%5B%22Fall+Faves%22%5D%2C%22page%22%3A4%7D\",\n",
        "                \"category\": \"Food\",\n",
        "            },\n",
        "            {\n",
        "                \"url\": \"https://www.traderjoes.com/home/products/category/food-8?filters=%7B%22tags%22%3A%5B%22Fall+Faves%22%5D%2C%22page%22%3A5%7D\",\n",
        "                \"category\": \"Food\",\n",
        "            },\n",
        "            {\n",
        "                \"url\": \"https://www.traderjoes.com/home/products/category/beverages-182?filters=%7B%22tags%22%3A%5B%22Fall+Faves%22%5D%7D\",\n",
        "                \"category\": \"Beverage\",\n",
        "            },\n",
        "            {\n",
        "                \"url\": \"https://www.traderjoes.com/home/products/category/flowers-plants-203?filters=%7B%22tags%22%3A%5B%22Fall+Faves%22%5D%7D\",\n",
        "                \"category\": \"Flowers&Plants\",\n",
        "            },\n",
        "            {\n",
        "                \"url\": \"https://www.traderjoes.com/home/products/category/everything-else-215?filters=%7B%22tags%22%3A%5B%22Fall+Faves%22%5D%7D\",\n",
        "                \"category\": \"EverythingElse\",\n",
        "            },\n",
        "        ]\n",
        "\n",
        "        items = []\n",
        "\n",
        "        # loop through each URL\n",
        "        for info in pages:\n",
        "            await page.goto(info[\"url\"])\n",
        "\n",
        "            # let page load\n",
        "            await page.wait_for_selector(\n",
        "                \"li.ProductList_productList__item__1EIvq\",\n",
        "                state=\"attached\",\n",
        "                timeout=60000,\n",
        "            )\n",
        "\n",
        "            # li.ProductList_productList__item__1EIvq is where all our info lives\n",
        "            products = await page.query_selector_all(\n",
        "                \"li.ProductList_productList__item__1EIvq\"\n",
        "            )\n",
        "\n",
        "            # get all our info\n",
        "            for product in products:\n",
        "                result = {}\n",
        "\n",
        "                name = await product.query_selector(\n",
        "                    \"h2.ProductCard_card__title__text__uiWLe a\"\n",
        "                )\n",
        "                price = await product.query_selector(\n",
        "                    \"span.ProductPrice_productPrice__price__3-50j\"\n",
        "                )\n",
        "\n",
        "                if name and price:\n",
        "                    result[\"name\"] = await name.inner_text()\n",
        "\n",
        "                    # have to make price a number\n",
        "                    price_text = await price.inner_text()\n",
        "                    convert_price = float(price_text.replace(\"$\", \"\").strip())\n",
        "                    result[\"price\"] = convert_price\n",
        "\n",
        "                    # category is so we can save it nicely later\n",
        "                    result[\"category\"] = info[\"category\"]\n",
        "                    items.append(result)\n",
        "\n",
        "        for item in items:\n",
        "            print(\n",
        "                f\"Name: {item['name']}, Price: {item['price']}, Category: {item['category']}\"\n",
        "            )\n",
        "\n",
        "        await browser.close()\n",
        "        return items\n",
        "\n",
        "\n",
        "scraped_products = await traderJoesScraper()\n",
        "print(scraped_products)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_FvDQnJbSM5G"
      },
      "source": [
        "To keep track of the items, we can go ahead and quickly count them:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "WbK-pB3diVVv",
        "outputId": "91d4188d-eb46-4dec-9e0f-2883e4bab297"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "89\n"
          ]
        }
      ],
      "source": [
        "scraped_products_count = len(scraped_products)\n",
        "print(scraped_products_count)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vVsWZwYwSQRo"
      },
      "source": [
        "As of the date this was scrapped we had 89 products."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qG9xp4g5SQ-7"
      },
      "source": [
        "Now, let’s go ahead and save our products into a `.txt` file so we can use it later in our tutorial when we are using our LlamaIndex and Atlas Vector Search integration. Go ahead and name the file whatever you like, for sake of tracking I’m naming mine: `tj_fall_faves_oct30.txt`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Vpq4LhtRrDVP"
      },
      "outputs": [],
      "source": [
        "with open(\"tj_fall_faves_oct30.txt\", \"w\") as f:\n",
        "    for item in scraped_products:\n",
        "        f.write(\n",
        "            f\"Name: {item['name']}, Price: ${item['price']}, Category: {item['category']}\\n\"\n",
        "        )"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Rjhx1kKdidQr"
      },
      "outputs": [],
      "source": [
        "# save into .csv file with name, price, and category columns\n",
        "import pandas as pd\n",
        "\n",
        "df = pd.DataFrame(scraped_products)\n",
        "\n",
        "csv_path = \"tj_fall_faves_oct30.csv\"\n",
        "df.to_csv(csv_path, index=False)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "V2Ioq8IeSXky"
      },
      "source": [
        "Since we are using a notebook, please make sure that you download the file locally, since once our runtime is disconnected the `.txt` file will be lost.\n",
        "\n",
        "Now that we have all our Trader Joe’s fall products let’s go ahead and build out our AI Party Planner!\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LMuoPlEwi2bq"
      },
      "source": [
        "## Part 2: LlamaIndex and Atlas Vector Search Integration"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qTkfhJ_3SaNY"
      },
      "source": [
        "This is the quickstart we are using in order to be successful with this part of the tutorial: https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-integrations/llamaindex/#:~:text=You%20can%20integrate%20Atlas%20Vector,RAG). We will be going over how to use Atlas Vector Search with LlamaIndex to build a RAG application with chat capabilities!\n",
        "\n",
        "This section will cover in detail how to set up the environment, store our custom data that we previously scraped on Atlas, create an Atlas Vector Search index on top of our data, and to finish up we will implement RAG and will use Atlas Vector Search to answer questions from our unique data store.\n",
        "\n",
        "\n",
        "Let’s first use `pip` to install all our necessary libraries. We will need to include `llama-index`, `llama-index-vector-stores-mongodb`, and `llama-index-embeddings-openai`.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "collapsed": true,
        "id": "rT7cIJGOi8bb",
        "outputId": "015aaad4-4786-45d4-96de-13ff4128e692"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\u001b[?25l   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/1.4 MB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K   \u001b[91m━━━━━━━━━━━━\u001b[0m\u001b[91m╸\u001b[0m\u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.5/1.4 MB\u001b[0m \u001b[31m14.5 MB/s\u001b[0m eta \u001b[36m0:00:01\u001b[0m\r\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.4/1.4 MB\u001b[0m \u001b[31m20.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m313.6/313.6 kB\u001b[0m \u001b[31m20.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.6/1.6 MB\u001b[0m \u001b[31m53.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.2/1.2 MB\u001b[0m \u001b[31m49.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.5/1.5 MB\u001b[0m \u001b[31m55.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m386.9/386.9 kB\u001b[0m \u001b[31m26.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m76.4/76.4 kB\u001b[0m \u001b[31m6.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m78.0/78.0 kB\u001b[0m \u001b[31m6.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m325.2/325.2 kB\u001b[0m \u001b[31m21.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m176.8/176.8 kB\u001b[0m \u001b[31m13.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m295.8/295.8 kB\u001b[0m \u001b[31m18.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.2/1.2 MB\u001b[0m \u001b[31m49.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.5/49.5 kB\u001b[0m \u001b[31m3.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K   \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m4.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h"
          ]
        }
      ],
      "source": [
        "pip install --quiet --upgrade llama-index llama-index-vector-stores-mongodb llama-index-embeddings-openai pymongo"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ukXtxk2US2UA"
      },
      "source": [
        "Now go ahead and import in your necessary import statements:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "NlHnxQ10jIT_"
      },
      "outputs": [],
      "source": [
        "import getpass\n",
        "import os\n",
        "\n",
        "import pymongo\n",
        "from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex\n",
        "from llama_index.core.query_engine import RetrieverQueryEngine\n",
        "from llama_index.core.retrievers import VectorIndexRetriever\n",
        "from llama_index.core.settings import Settings\n",
        "from llama_index.embeddings.openai import OpenAIEmbedding\n",
        "from llama_index.llms.openai import OpenAI\n",
        "from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch\n",
        "from pymongo.operations import SearchIndexModel"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QUyUFyVIS4sA"
      },
      "source": [
        "Input your OpenAI API Key and your MongoDB Atlas cluster connection string when prompted:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "PTmjKiJtkFkc",
        "outputId": "480688d2-e096-4167-921f-095bcf0825ef"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "OpenAI API Key:··········\n",
            "MongoDB Atlas SRV Connection String:··········\n"
          ]
        }
      ],
      "source": [
        "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
        "ATLAS_CONNECTION_STRING = getpass.getpass(\"MongoDB Atlas SRV Connection String:\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7cnserS0S66-"
      },
      "source": [
        "Once your keys are in, let’s go ahead and assign our specific models for `llama_index` so it knows how to properly embed our file. This is just to keep everything consistent!"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "MMSStFZmkOya"
      },
      "outputs": [],
      "source": [
        "Settings.llm = OpenAI()\n",
        "Settings.embed_model = OpenAIEmbedding(model=\"text-embedding-ada-002\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ofR8Dt1WS9MH"
      },
      "source": [
        "Now we can go ahead and read in our `.txt` file with our scraped products. We are doing this using the `SimpleDirectoryReader` from `llama_index`. Text files aren’t the only files that can be nicely loaded into LlamaIndex. There are a ton of other supported methods and I recommend checking out some of their [supported file types](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader/?gad_source=1&gclid=Cj0KCQjwsoe5BhDiARIsAOXVoUsbgqjQcjmkV_KFLzS0TwUcONhaXfTaVT-C71A8Py_dHPHHSs-hmMsaAsbaEALw_wcB).\n",
        "\n",
        "So here we are just reading the contents of our file and then returning it as a list of documents; the format LlamaIndex requires.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "collapsed": true,
        "id": "Re23w1yUmWWa",
        "outputId": "8b60038e-031f-4291-9c89-ad250c5598b9"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "Document(id_='757e7f69-9104-48f4-8b0b-49a4d77e7350', embedding=None, metadata={'file_path': '/content/tj_fall_faves_oct30.txt', 'file_name': 'tj_fall_faves_oct30.txt', 'file_type': 'text/plain', 'file_size': 5672, 'creation_date': '2024-10-30', 'last_modified_date': '2024-10-30'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text=\"Name: Teeny Tiny Pecan Pies, Price: $4.99, Category: Food\\nName: Hickory Road Smokehouse Uncured Carver Ham, Price: $5.99, Category: Food\\nName: Frick's Uncured Carver Ham, Price: $5.99, Category: Food\\nName: Organic Fuyu Persimmons, Price: $3.99, Category: Food\\nName: Cut Sweet Potatoes, Price: $5.49, Category: Food\\nName: Jumbo Pomegranate, Price: $2.99, Category: Food\\nName: Cheesy Herby Biscuits, Price: $4.99, Category: Food\\nName: Egg Nog Whole Milk Greek Yogurt, Price: $0.99, Category: Food\\nName: Triple Creme Brie with Calvados Apple Brandy, Price: $12.99, Category: Food\\nName: Chocolatey Caramel Pretzel Drumstick Decorating Kit, Price: $4.99, Category: Food\\nName: Hot Chocolate Stirring Spoon, Price: $0.99, Category: Food\\nName: Gluten Free Turkey Gravy, Price: $3.99, Category: Food\\nName: Dinner Rolls, Price: $2.99, Category: Food\\nName: Organic Peeled & Cooked Chestnuts, Price: $4.99, Category: Food\\nName: Butter with Brown Sugar & Maple Syrup, Price: $2.99, Category: Food\\nName: Double Fold Alcohol Free Bourbon Vanilla Flavoring, Price: $7.99, Category: Food\\nName: Delicata Squash, Price: $1.49, Category: Food\\nName: Spaghetti Squash, Price: $2.99, Category: Food\\nName: Sugar Bee® Apple, Price: $1.29, Category: Food\\nName: Savory Squash Pastry Bites, Price: $5.49, Category: Food\\nName: All Butter Apple Shortbread Cookies, Price: $3.49, Category: Food\\nName: 
Caramelized Onion Goat's Milk Cheese, Price: $2.99, Category: Food\\nName: Pumpkin Loaf, Price: $4.99, Category: Food\\nName: Pumpkin Butter, Price: $2.99, Category: Food\\nName: Fresh Cranberries, Price: $2.29, Category: Food\\nName: Cinnamon Sticks, Price: $2.99, Category: Food\\nName: Fully Cooked Spiral Sliced Uncured Half Ham, Price: $5.99, Category: Food\\nName: Non-Dairy Cinnamon Bun Oat Creamer, Price: $1.99, Category: Food\\nName: Organic Pomegranate, Price: $2.49, Category: Food\\nName: Organic Cranberries, Price: $2.99, Category: Food\\nName: Thanksgiving Stuffing Seasoned Popcorn, Price: $2.99, Category: Food\\nName: Organic Pumpkin, Price: $2.49, Category: Food\\nName: Apple Overnight Oats, Price: $1.99, Category: Food\\nName: Mashed Sweet Potatoes, Price: $2.99, Category: Food\\nName: Halloween Gummies, Price: $4.49, Category: Food\\nName: Teeny Tiny Apple Pies, Price: $4.99, Category: Food\\nName: Lil' Tiger Stripe Pumpkin, Price: $1.49, Category: Food\\nName: Petite Pumpkin Spice Cookies, Price: $3.99, Category: Food\\nName: Pumpkin Joe-Joe's Cookies, Price: $2.99, Category: Food\\nName: Double Fold Bourbon Vanilla Extract, Price: $7.99, Category: Food\\nName: Butternut Squash Italian Lasagna, Price: $4.49, Category: Food\\nName: Apple Cinnamon Buns, Price: $4.49, Category: Food\\nName: Cut Butternut Squash, Price: $3.99, Category: Food\\nName: Pumpkin Pie Spice, Price: $2.99, Category: Food\\nName: Mini Maple Flavored Marshmallows, Price: $2.99, Category: Food\\nName: Maple Flavored Fudge, Price: $2.99, Category: Food\\nName: Double Crème Brie with Truffles, Price: $9.99, Category: Food\\nName: Truffle Dip, Price: $5.49, Category: Food\\nName: Pumpkin Pie, Price: $6.99, Category: Food\\nName: Haricots Verts, Price: $5.99, Category: Food\\nName: Herbes de Provence, Price: $4.99, Category: Food\\nName: Creamy Toscano Cheese Dusted with Cinnamon, Price: $10.99, Category: Food\\nName: Caramel Apple Mochi, Price: $4.99, Category: Food\\nName: Cornbread 
Stuffing, Price: $5.99, Category: Food\\nName: Pumpkin Spiced Joe-Joe's Sandwich Cookies, Price: $4.49, Category: Food\\nName: Salted Maple Ice Cream, Price: $3.79, Category: Food\\nName: Roasted Turkey & Sweet Potato Burrito, Price: $4.49, Category: Food\\nName: Cinnamon Roll Blondie Bar Baking Mix, Price: $3.99, Category: Food\\nName: Pumpkin Cheesecake Croissants, Price: $4.49, Category: Food\\nName: Apple Cider Donuts, Price: $4.49, Category: Food\\nName: Pumpkin Overnight Oats, Price: $1.99, Category: Food\\nName: Fuyu Persimmons, Price: $0.79, Category: Food\\nName: Cut Butternut Squash, Price: $3.99, Category: Food\\nName: Brussels Sprouts, Price: $4.99, Category: Food\\nName: Thanksgiving Stuffing Seasoned Kettle Chips, Price: $2.99, Category: Food\\nName: Nuts About Rosemary Mix, Price: $7.99, Category: Food\\nName: Brined Bone-In Half Turkey Breast Fully Cooked, Price: $9.99, Category: Food\\nName: Harvest Apple Salad Kit, Price: $3.99, Category: Food\\nName: Cornbread Stuffing Mix, Price: $4.99, Category: Food\\nName: Condensed Cream of Portabella Mushroom Soup, Price: $1.99, Category: Food\\nName: All Butter Puff Pastry Sheets, Price: $4.99, Category: Food\\nName: Turkey Gravy, Price: $1.69, Category: Food\\nName: Nantucket Style Cranberry Pie, Price: $6.99, Category: Food\\nName: White Stilton with Cranberries, Price: $11.99, Category: Food\\nName: Truffle Salami, Price: $4.99, Category: Food\\nName: Autumn Maple Coffee, Price: $8.99, Category: Beverage\\nName: Triple Ginger Brew Sparkling Beverage, Price: $3.99, Category: Beverage\\nName: Harvest Blend Herbal Tea, Price: $2.49, Category: Beverage\\nName: Non-Dairy Oat Beverage Maple Flavor, Price: $2.99, Category: Beverage\\nName: Maple Espresso Black Tea Blend, Price: $2.99, Category: Beverage\\nName: Non-Dairy Pumpkin Oat Beverage, Price: $2.99, Category: Beverage\\nName: Mum Fleurettes, Price: $4.99, Category: Flowers&Plants\\nName: Assorted Mum Plants, Price: $6.99, Category: 
Flowers&Plants\\nName: Eight Candles, Price: $4.49, Category: EverythingElse\\nName: Orange & Spice Scented Candle & Room Spritz, Price: $5.99, Category: EverythingElse\\nName: Harvest Brunch Dog Treats, Price: $3.49, Category: EverythingElse\\nName: Cinnamon Whisk, Price: $1.29, Category: EverythingElse\\nName: Cinnamon Broom, Price: $4.99, Category: EverythingElse\\nName: Pumpkin Maple Bacon Stuffies Dog Treats, Price: $4.49, Category: EverythingElse\\n\", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\\n\\n{content}', metadata_template='{key}: {value}', metadata_seperator='\\n')"
            ]
          },
          "execution_count": 12,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "sample_data = SimpleDirectoryReader(\n",
        "    input_files=[\"/content/tj_fall_faves_oct30.txt\"]\n",
        ").load_data()\n",
        "sample_data[0]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZekXaCaMS_W-"
      },
      "source": [
        "Now that our file has been read in, let’s connect to our MongoDB Atlas cluster and set up a vector store! Feel free to name the database and collection anything you like. We are initializing a vector store using `MongoAtlasVectorSearch` from `llama_index` which will allow us to work with our embedded documents directly in our cluster."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "lLdu5DrJoEHJ"
      },
      "outputs": [],
      "source": [
        "# Connect to your Atlas cluster\n",
        "mongo_client = pymongo.MongoClient(\n",
        "    ATLAS_CONNECTION_STRING, appname=\"devrel.showcase.tj_fall_faves\"\n",
        ")\n",
        "\n",
        "# Instantiate the vector store\n",
        "atlas_vector_store = MongoDBAtlasVectorSearch(\n",
        "    mongo_client,\n",
        "    db_name=\"tj_products\",\n",
        "    collection_name=\"fall_faves\",\n",
        "    vector_index_name=\"vector_index\",\n",
        ")\n",
        "vector_store_context = StorageContext.from_defaults(vector_store=atlas_vector_store)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Yr6M8h03TDer"
      },
      "source": [
        "Since our vector store has been defined (by our `vector_store_context`) let’s go ahead and create a vector index in MongoDB for our documents in `sample_data`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 81,
          "referenced_widgets": [
            "abb994db236e4786b97a07d231473dd0",
            "65985a39477a4e9b957706278b6e42d5",
            "d2be4f26b70240eeba464db8310aedde",
            "171aaf7fb30946c396f09025f8b1c304",
            "e27576e3243a4e6fb5a342a66f8e3c1b",
            "511ce4303f49467caeea0eee7214ce88",
            "7d10d5a0e7ec45c2bb6b19465259af70",
            "af723eee69994e02a73cc0ce6b75ff93",
            "e1524470f5594ae69784be51402b2991",
            "20da10bc61594334a5e299d4caae4586",
            "d7e04d7dd43647be9133dc7b9d70694a",
            "5e819550a7154f5fbc818aadfd487561",
            "e543fb1594b4495fadbe4966874fceb5",
            "b1c7a41c18fd4dcea8b47fdee95870ee",
            "be66c6821b264181830a7f449a2cc26c",
            "c20bda0c76444c56ba6a8603595912c4",
            "1719a53ee20243bd922c73a4431ebb20",
            "0c58378f5668408bb97d60daaa68d458",
            "2d0edfb9a3564e3da5f78ae1787b0bfc",
            "9a79752d1d644dc0ba4c6280532585f3",
            "3af0e639e4f5495e9a4b1c71f79d9cc5",
            "8ab80dc15e0240f39c6f0d86e731b759"
          ]
        },
        "id": "J8g2Ab4_oQoP",
        "outputId": "a43ec7e3-b2e4-48ce-e630-d661d9d815e1"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "abb994db236e4786b97a07d231473dd0",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Parsing nodes:   0%|          | 0/1 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "5e819550a7154f5fbc818aadfd487561",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Generating embeddings:   0%|          | 0/2 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "vector_store_index = VectorStoreIndex.from_documents(\n",
        "    sample_data, storage_context=vector_store_context, show_progress=True\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JTTwEvCrTF18"
      },
      "source": [
        "Once this cell has run you can go ahead and view your data with the embeddings inside of your Atlas cluster.\n",
        "\n",
        "In order to allow for vector search queries on our created vector store, we need to create an Atlas Vector Search index on our tj_products.fall_faves collection. We can do this either through the [Atlas UI](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/) or directly from our notebook:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "id": "gEfP5K8Nrrry",
        "outputId": "c0d97a40-255f-4860-fe3d-5989627c4155"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "string"
            },
            "text/plain": [
              "'vector_index'"
            ]
          },
          "execution_count": 15,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# Specify the collection for which to create the index\n",
        "collection = mongo_client[\"tj_products\"][\"fall_faves\"]\n",
        "\n",
        "# Create your index model, then create the search index\n",
        "search_index_model = SearchIndexModel(\n",
        "    definition={\n",
        "        \"fields\": [\n",
        "            {\n",
        "                \"type\": \"vector\",\n",
        "                \"path\": \"embedding\",\n",
        "                \"numDimensions\": 1536,\n",
        "                \"similarity\": \"cosine\",\n",
        "            },\n",
        "            {\"type\": \"filter\", \"path\": \"metadata.page_label\"},\n",
        "        ]\n",
        "    },\n",
        "    name=\"vector_index\",\n",
        "    type=\"vectorSearch\",\n",
        ")\n",
        "\n",
        "collection.create_search_index(model=search_index_model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "F398Wr-aTLUJ"
      },
      "source": [
        "You’ll be able to see this index once it’s up and running under your “Atlas Search” tab in your Atlas UI. Once it’s done, we can start querying our data and we can do some basic RAG."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TxAZQzWOTNMK"
      },
      "source": [
        "## Part 3: Basic RAG"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "v3rIQipnTPHB"
      },
      "source": [
        "With our Atlas Vector Search index up and running we are ready to have some fun and bring our AI Party Planner to life! We are going to continue with this dream team where we will use Atlas Vector Search to get our documents and LlamaIndex’s query engine to actually answer our questions based on our documents.\n",
        "\n",
        "To do this, we will need to have Atlas Vector Search become a [vector index retriever](https://docs.llamaindex.ai/en/stable/api_reference/retrievers/vector/) and we will need to initialize a `RetrieverQueryEngine` to handle queries by passing each question through our vector retrieval system. This combination will allow us to ask any questions we want in natural language, and it will match us with the most accurate documents."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "hFekV23NsKrN",
        "outputId": "f6c483f4-2eea-4c6c-cc35-4b21ba96c6b4"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Mum Fleurettes are available for $4.99 and Assorted Mum Plants are available for $6.99.\n"
          ]
        }
      ],
      "source": [
        "# Instantiate Atlas Vector Search as a retriever\n",
        "vector_store_retriever = VectorIndexRetriever(\n",
        "    index=vector_store_index, similarity_top_k=5\n",
        ")\n",
        "\n",
        "# Pass the retriever into the query engine\n",
        "query_engine = RetrieverQueryEngine(retriever=vector_store_retriever)\n",
        "\n",
        "# Prompt the LLM\n",
        "response = query_engine.query(\n",
        "    \"Which plant items are available right now? Please provide prices\"\n",
        ")\n",
        "\n",
        "print(response)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6cIgtzAPTUQg"
      },
      "source": [
        "But what if we want to keep asking questions and get responses that remember the earlier conversation? Let’s quickly build a [Chat Engine](https://docs.llamaindex.ai/en/stable/module_guides/deploying/chat_engines/)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "80TOfQnAxQyt"
      },
      "source": [
        "## Part 4: Chat engine"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qVIScnrjOn8C"
      },
      "source": [
        "Instead of asking one question at a time about our Trader Joe’s products, we can have a back-and-forth conversation to get the most out of our AI Party Planner.\n",
        "\n",
        "We first need to initialize the chat engine from our `vector_store_index` with streaming enabled, so the response prints as it is generated. We also use [condense question mode](https://docs.llamaindex.ai/en/stable/examples/chat_engine/chat_engine_condense_question/), which rewrites each follow-up question, together with the chat history, into a standalone question before querying the index, so that follow-ups such as “what are the prices?” still make sense:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "aoVKscNGyPQ3",
        "outputId": "52ce0461-0460-494e-f571-d3fcf42d5071"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Ask away! Type 'exit' to quit >>> hi! i am planning a fall party\n",
            "\n",
            "\n",
            "Consider including a variety of fall-themed food and beverages such as pumpkin pie, apple cider donuts, maple-flavored fudge, pumpkin spiced cookies, and harvest blend herbal tea to create a festive atmosphere for your fall party. Additionally, you could incorporate seasonal decorations like cinnamon brooms, scented candles, and mum plants to enhance the autumn ambiance.\n",
            "\n",
            "Ask away! Type 'exit' to quit >>> i want to make a turkey, which three sides with prices and reasonings will be best\n",
            "\n",
            "\n",
            "The best three side dishes to serve with turkey at a fall party would be Cut Butternut Squash, Brussels Sprouts, and Cornbread Stuffing. Cut Butternut Squash and Brussels Sprouts are reasonably priced at $3.99 and $4.99 respectively, offering a balance of flavors and textures that complement the turkey well. Cornbread Stuffing, priced at $5.99, adds a traditional touch to the meal and enhances the overall fall-themed dining experience.\n",
            "\n",
            "Ask away! Type 'exit' to quit >>> which drinks should i serve? i want something caffinated \n",
            "\n",
            "\n",
            "Harvest Blend Herbal Tea and Autumn Maple Coffee would be ideal caffeinated drinks to serve at a fall party to complement the autumn-themed food and create a festive atmosphere.\n",
            "\n",
            "Ask away! Type 'exit' to quit >>> what are the prices of these drinks\n",
            "\n",
            "\n",
            "$2.49 for Harvest Blend Herbal Tea and $8.99 for Autumn Maple Coffee.\n",
            "\n",
            "Ask away! Type 'exit' to quit >>> which decor should i use? i want my home to smell nice\n",
            "\n",
            "\n",
            "Cinnamon Whisk, Cinnamon Broom, Orange & Spice Scented Candle & Room Spritz\n",
            "\n",
            "Ask away! Type 'exit' to quit >>> what are the prices?\n",
            "\n",
            "\n",
            "$5.99, $1.29, $4.99\n",
            "\n",
            "Ask away! Type 'exit' to quit >>> exit\n",
            "Exiting chat. Have a happy fall!\n"
          ]
        }
      ],
      "source": [
        "# Build a LlamaIndex chat engine on top of the vector store index\n",
        "chat_engine = vector_store_index.as_chat_engine(\n",
        "    chat_mode=\"condense_question\", streaming=True\n",
        ")\n",
        "\n",
        "while True:\n",
        "    # Read the next question from the user\n",
        "    question = input(\"Ask away! Type 'exit' to quit >>> \")\n",
        "\n",
        "    # Quit when the user types 'exit' (ignoring case and surrounding spaces)\n",
        "    if question.strip().lower() == \"exit\":\n",
        "        print(\"Exiting chat. Have a happy fall!\")\n",
        "        break\n",
        "\n",
        "    print(\"\\n\")\n",
        "\n",
        "    # Send the question to the chat engine and get a streaming response\n",
        "    response_stream = chat_engine.stream_chat(question)\n",
        "\n",
        "    # Print the response tokens as they arrive\n",
        "    response_stream.print_response_stream()\n",
        "    print(\"\\n\")"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "authorship_tag": "ABX9TyPiD/vY70Az64rBGvSx+n6p",
      "include_colab_link": true,
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    },
    "widgets": {
      "application/vnd.jupyter.widget-state+json": {
        "0c58378f5668408bb97d60daaa68d458": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "1719a53ee20243bd922c73a4431ebb20": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "171aaf7fb30946c396f09025f8b1c304": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_20da10bc61594334a5e299d4caae4586",
            "placeholder": "​",
            "style": "IPY_MODEL_d7e04d7dd43647be9133dc7b9d70694a",
            "value": " 1/1 [00:00&lt;00:00, 31.13it/s]"
          }
        },
        "20da10bc61594334a5e299d4caae4586": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "2d0edfb9a3564e3da5f78ae1787b0bfc": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "3af0e639e4f5495e9a4b1c71f79d9cc5": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "511ce4303f49467caeea0eee7214ce88": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "5e819550a7154f5fbc818aadfd487561": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_e543fb1594b4495fadbe4966874fceb5",
              "IPY_MODEL_b1c7a41c18fd4dcea8b47fdee95870ee",
              "IPY_MODEL_be66c6821b264181830a7f449a2cc26c"
            ],
            "layout": "IPY_MODEL_c20bda0c76444c56ba6a8603595912c4"
          }
        },
        "65985a39477a4e9b957706278b6e42d5": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_511ce4303f49467caeea0eee7214ce88",
            "placeholder": "​",
            "style": "IPY_MODEL_7d10d5a0e7ec45c2bb6b19465259af70",
            "value": "Parsing nodes: 100%"
          }
        },
        "7d10d5a0e7ec45c2bb6b19465259af70": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "8ab80dc15e0240f39c6f0d86e731b759": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "9a79752d1d644dc0ba4c6280532585f3": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "abb994db236e4786b97a07d231473dd0": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_65985a39477a4e9b957706278b6e42d5",
              "IPY_MODEL_d2be4f26b70240eeba464db8310aedde",
              "IPY_MODEL_171aaf7fb30946c396f09025f8b1c304"
            ],
            "layout": "IPY_MODEL_e27576e3243a4e6fb5a342a66f8e3c1b"
          }
        },
        "af723eee69994e02a73cc0ce6b75ff93": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "b1c7a41c18fd4dcea8b47fdee95870ee": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_2d0edfb9a3564e3da5f78ae1787b0bfc",
            "max": 2,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_9a79752d1d644dc0ba4c6280532585f3",
            "value": 2
          }
        },
        "be66c6821b264181830a7f449a2cc26c": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_3af0e639e4f5495e9a4b1c71f79d9cc5",
            "placeholder": "​",
            "style": "IPY_MODEL_8ab80dc15e0240f39c6f0d86e731b759",
            "value": " 2/2 [00:00&lt;00:00,  3.11it/s]"
          }
        },
        "c20bda0c76444c56ba6a8603595912c4": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "d2be4f26b70240eeba464db8310aedde": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_af723eee69994e02a73cc0ce6b75ff93",
            "max": 1,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_e1524470f5594ae69784be51402b2991",
            "value": 1
          }
        },
        "d7e04d7dd43647be9133dc7b9d70694a": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "e1524470f5594ae69784be51402b2991": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "e27576e3243a4e6fb5a342a66f8e3c1b": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "e543fb1594b4495fadbe4966874fceb5": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_1719a53ee20243bd922c73a4431ebb20",
            "placeholder": "​",
            "style": "IPY_MODEL_0c58378f5668408bb97d60daaa68d458",
            "value": "Generating embeddings: 100%"
          }
        },
        "state": {}
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
